Abstract

We show that with high probability a random subset of $\{1,\dots,n\}$ of size $\Theta(n^{1-1/k})$
contains two elements $a$ and $a + d^k$, where $d$ is a positive integer. As a consequence, we
prove an analogue of the Sárközy-Furstenberg theorem for a random subset of $\{1,\dots,n\}$.



1. Introduction 

Let $p$ be a general additive configuration, $p = (a, a+P_1(d), \dots, a+P_{k-1}(d))$, where
$P_i \in \mathbb{Z}[d]$ and $P_i(0) = 0$. Let $[n]$ denote the set of positive integers up to $n$.
A natural question is:

Question 1.1. How is p distributed in [n]? 

Roth's theorem [5] says that for $\delta > 0$ and sufficiently large $n$, any subset of $[n]$ of size
$\delta n$ contains a nontrivial instance of $p = (a, a+d, a+2d)$ (here nontrivial means $d \neq 0$).
In 1975, Szemerédi [8] extended Roth's theorem to general linear configurations
$p = (a, a+d, \dots, a+(k-1)d)$. For a configuration of type $p = (a, a+P(d))$, Sárközy [7] and
Furstenberg [2] independently discovered a similar phenomenon.

Theorem 1.2 (Sárközy-Furstenberg theorem, quantitative version). [HI Theorem 3.2],[U
Theorem 3.1] Let $\delta$ be a fixed positive real number, and let $P$ be a polynomial with integer
coefficients satisfying $P(0) = 0$. Then there exist an integer $n(\delta, P)$ and a positive
constant $c(\delta, P)$ with the following property. If $n > n(\delta, P)$ and $A \subset [n]$ is any
subset of cardinality at least $\delta n$, then

• $A$ contains a nontrivial instance of $p = (a, a + P(d))$.

• $A$ contains at least $c(\delta, P)|A|^2 n^{1/\deg(P) - 1}$ instances of $p = (a, a + P(d))$.

In 1996, Bergelson and Leibman pQ extended this result to all configurations
$p = (a, a + P_1(d), \dots, a + P_{k-1}(d))$, where $P_i \in \mathbb{Z}[d]$ and $P_i(0) = 0$ for all $i$.

Following Question 1.1, one may consider the distribution of $p$ in a "pseudo-random" set.

Question 1.3. Does the set of primes contain a nontrivial instance of $p$? How is $p$
distributed in this set?

The famous Green-Tao theorem [3] says that any subset of positive upper density of the
set of primes contains a nontrivial instance of $p = (a, a+d, \dots, a+(k-1)d)$ for any $k$. This
phenomenon also holds for more general configurations $(a, a+P_1(d), \dots, a+P_{k-1}(d))$, where
$P_i \in \mathbb{Z}[d]$ and $P_i(0) = 0$ for all $i$ (cf. [9]).

The main goal of this note is to consider a similar question. 

Question 1.4. How is p distributed in a typical random subset of [n] ? 

Let $p$ be an additive configuration and let $\delta$ be a fixed positive real number. We say that
a set $A$ is $(\delta, p)$-dense if any subset of $A$ of cardinality at least $\delta|A|$ contains a
nontrivial instance of $p$. In 1991, Kohayakawa, Łuczak and Rödl [5] showed the following result.

Theorem 1.5. Almost every subset $R$ of $[n]$ of cardinality $|R| = r \gg_\delta n^{1/2}$ is
$(\delta, (a, a+d, a+2d))$-dense.

The assumption $r \gg_\delta n^{1/2}$ is tight, up to a constant factor. Indeed, a typical random
subset $R$ of $[n]$ of cardinality $r$ contains $\Theta(r^3/n)$ three-term arithmetic progressions.
Hence, if $(1-\delta)r \gg r^3/n$, then deleting one element from each progression leaves a subset
of $R$ of cardinality at least $\delta r$ which does not contain any nontrivial 3-term arithmetic
progression.
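The heuristic count in the preceding paragraph can be checked by direct computation. The following Python sketch is only an illustration (the function names are hypothetical and not taken from the paper): it counts the three-term progressions with $d \ge 1$ in $[n]$ exactly and multiplies by the probability that a fixed triple lies in a uniformly random $r$-subset.

```python
def count_3aps(n: int) -> int:
    """Number of triples (a, a+d, a+2d) inside {1..n} with d >= 1."""
    # For a fixed d there are n - 2d admissible starting points a.
    return sum(n - 2 * d for d in range(1, (n - 1) // 2 + 1))


def expected_3aps(n: int, r: int) -> float:
    """Expected number of 3-APs in a uniform random r-subset of {1..n}."""
    # A fixed 3-element set lies in the random subset with probability
    # r(r-1)(r-2) / (n(n-1)(n-2)).
    survive = r * (r - 1) * (r - 2) / (n * (n - 1) * (n - 2))
    return count_3aps(n) * survive
```

Since $[n]$ contains about $n^2/4$ such progressions, the expected count is about $r^3/(4n)$, in line with the order $r^3/n$ used above.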

Motivated by Theorem 1.5, Łaba and Hamel [1] studied the distribution of $p = (a, a + d^k)$
in a typical random subset of $[n]$, as follows.

Theorem 1.6. Let $k \ge 2$ be an integer. Then there exists a positive real number $\varepsilon(k)$
with the following property. Let $\delta$ be a fixed positive real number; then almost every subset
$R$ of $[n]$ of cardinality $|R| = r \gg_\delta n^{1-\varepsilon(k)}$ is $(\delta, (a, a + d^k))$-dense.

It was shown that $\varepsilon(2) = 1/110$, and $\varepsilon(3) \gg \varepsilon(2)$, etc. Although the
method used in [1] is strong, it seems to fall short of obtaining relatively good estimates for
$\varepsilon(k)$. On the other hand, one can show that $\varepsilon(k) \le 1/k$. Indeed, a typical random
subset of $[n]$ of size $r$ contains $\Theta(n^{1+1/k} r^2/n^2)$ instances of $(a, a + d^k)$. Thus if
$(1-\delta)r \gg n^{1+1/k} r^2/n^2$ (which implies $r \ll_\delta n^{1-1/k}$) then there is a subset of $R$
of size $\delta r$ which does not contain any nontrivial instance of $(a, a + d^k)$.
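The count of instances of $(a, a + d^k)$ used in this argument can be reproduced numerically. The sketch below is a hedged illustration with hypothetical function names: it counts the pairs in $[n]$ exactly and scales by the probability that a fixed pair lies in a uniform random $r$-subset.

```python
def count_power_pairs(n: int, k: int) -> int:
    """Number of pairs (a, a + d**k) inside {1..n} with d >= 1."""
    total, d = 0, 1
    while d ** k <= n - 1:
        total += n - d ** k  # admissible starting points a for this d
        d += 1
    return total


def expected_power_pairs(n: int, r: int, k: int) -> float:
    """Expected number of such pairs in a uniform random r-subset of {1..n}."""
    # A fixed pair lies in the random subset with probability r(r-1) / (n(n-1)).
    return count_power_pairs(n, k) * r * (r - 1) / (n * (n - 1))
```

For example, with $n = 10^6$, $k = 2$ and $r = 100$ (well below $n^{1-1/k} = 1000$) the expected number of pairs is roughly 6.6, so deleting one element from each pair typically removes only a small fraction of the $r$ elements.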



In this note we shall sharpen Theorem 1.6 by showing that $\varepsilon(k) = 1/k$.

Theorem 1.7 (Main theorem). Almost every subset $R$ of $[n]$ of size $|R| = r \gg_\delta n^{1-1/k}$ is
$(\delta, (a, a + d^k))$-dense.

Our method to prove Theorem 1.7 is elementary. We will invoke a combinatorial lemma
and the quantitative Sárközy-Furstenberg theorem (Theorem 1.2). As the reader will see
later on, the method also works for more general configurations $(a, a + P(d))$, where
$P \in \mathbb{Z}[d]$ and $P(0) = 0$.



2. A Combinatorial Lemma 

Let $G(X, Y)$ be a bipartite graph. We denote the number of edges between $X$ and $Y$
by $e(X,Y)$. The average degree $d(G)$ of $G$ is defined to be $e(X,Y)/(|X||Y|)^{1/2}$.

Lemma 2.1. Let $\{G = G([n],[n])\}_{n=1}^{\infty}$ be a sequence of bipartite graphs. Assume that for
any $\varepsilon > 0$ there exist an integer $n(\varepsilon)$ and a number $c(\varepsilon) > 0$ such that
$e(A,A) \ge c(\varepsilon)|A|^2 d(G)/n$ for all $n \ge n(\varepsilon)$ and all $A \subset [n]$ satisfying
$|A| \ge \varepsilon n$. Then for any $\alpha > 0$ there exist an integer $n(\alpha)$ and a number
$C(\alpha) > 0$ with the following property. If one chooses a random subset $S$ of $[n]$ of
cardinality $s$, then the probability of $G(S,S)$ being empty is at most $\alpha^s$, provided that
$|S| = s \ge C(\alpha)n/d(G)$ and $n \ge n(\alpha)$.

Proof. For short we denote the ground set $[n]$ by $V$. We shall view $S$ as an ordered random
subset, whose elements are chosen in order, $v_1$ first and $v_s$ last. We shall verify the lemma
within this probabilistic model. The deduction for the original (unordered) model follows easily.

For $1 \le k \le s-1$, let $N_k$ be the set of neighbors of the first $k$ chosen vertices, i.e.,
$N_k = \{v \in V : (v_i, v) \in E(G) \text{ for some } i \le k\}$. Since $G(S,S)$ is empty, we have
$v_{k+1} \notin N_k$. Next, let $B_{k+1}$ be the set of possible choices for $v_{k+1}$ (from
$V \setminus \{v_1, \dots, v_k\}$) such that $|N_{k+1} \setminus N_k| < c(\varepsilon)\varepsilon d(G)$, where
$\varepsilon$ will be chosen to be small enough ($\varepsilon = \alpha^2/6$ is fine) and $c(\varepsilon)$
is the constant from Lemma 2.1. We observe the following.

Claim 2.2. $|B_{k+1}| < \varepsilon|V|$.

To prove this claim, we assume for contradiction that $|B_{k+1}| \ge \varepsilon|V| = \varepsilon n$.
Since $B_{k+1} \cap N_k = \emptyset$, we have
$e(B_{k+1}, B_{k+1}) \le e(B_{k+1}, V \setminus N_k) < c(\varepsilon)\varepsilon d(G)|B_{k+1}| \le c(\varepsilon)|B_{k+1}|^2 d(G)/n$.
This contradicts the property of $G$ assumed in Lemma 2.1, provided that $n$ is large enough.

Thus we conclude that if $G(S,S)$ is empty then $|B_{k+1}| < \varepsilon|V|$ for $1 \le k \le s-1$.

Now let $s$ be sufficiently large, say $s \ge 2(c(\varepsilon)\varepsilon)^{-1} n/d(G)$, and assume that
the vertices $v_1, \dots, v_s$ have been chosen. Let $s'$ be the number of vertices $v_{k+1}$ that do
not belong to $B_{k+1}$. Then we have

$$n \ge |N_s| \ge \sum_{v_{k+1} \notin B_{k+1}} |N_{k+1} \setminus N_k| \ge s' c(\varepsilon)\varepsilon d(G).$$

Hence, $s' \le (c(\varepsilon)\varepsilon)^{-1} n/d(G) \le s/2$.

As a result, there are $s - s'$ vertices $v_{k+1}$ that belong to $B_{k+1}$. But since
$|B_{k+1}| < \varepsilon n$, we see that the number of ordered subsets $S$ of $V$ such that $G(S,S)$ is
empty is bounded by

$$\sum_{s' \le s/2} \binom{s}{s'} n^{s'} (\varepsilon n)^{s-s'} \le (6\varepsilon)^{s/2} n(n-1)\cdots(n-s+1) \le \alpha^s n(n-1)\cdots(n-s+1),$$

thereby completing the proof. □



3. Proof of Theorem 1.7



First, we define a bipartite graph $G$ on $[n] \times [n] = V_1 \times V_2$ by connecting $u \in V_1$ to
$v \in V_2$ if $v - u = d^k$ for some integer $d \in [1, n^{1/k}]$. Notice that $d(G) \sim C n^{1/k}$ for
some absolute constant $C$.
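The growth of the average degree can be observed on small instances. The sketch below is illustrative only: the function names are hypothetical, and the average degree is taken to be the number of edges divided by $n$, which is an assumption chosen to match the order $n^{1/k}$ stated above.

```python
def power_graph_edges(n: int, k: int) -> list[tuple[int, int]]:
    """Edges (u, v) with u, v in {1..n} and v - u = d**k for some d >= 1."""
    edges = []
    d = 1
    while d ** k <= n - 1:
        step = d ** k
        # Every u with u + step <= n is joined to u + step.
        edges.extend((u, u + step) for u in range(1, n - step + 1))
        d += 1
    return edges


def average_degree(n: int, k: int) -> float:
    """Edges per vertex of V1, i.e. e(V1, V2) / n (assumed normalisation)."""
    return len(power_graph_edges(n, k)) / n
```

Summing $n - d^k$ over $d \le n^{1/k}$ suggests that the ratio `average_degree(n, k) / n ** (1 / k)` approaches $k/(k+1)$ as $n$ grows; for $k = 2$ and $n = 10^4$ it is already close to $2/3$.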

Let us restate the Sárközy-Furstenberg theorem (Theorem 1.2, for $P(d) = d^k$) in terms
of the graph $G$.

Theorem 3.1. Let $\varepsilon > 0$ be a positive constant. Then there exist a positive integer
$n(\varepsilon, k)$ and a positive constant $c(\varepsilon, k)$ such that
$e(A, A) \ge c(\varepsilon, k)|A|^2 n^{1/k - 1}$ for all $n \ge n(\varepsilon, k)$ and all $A \subset [n]$
satisfying $|A| \ge \varepsilon n$.
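To get a feeling for the normalisation in Theorem 3.1, one can compute $e(A,A)/(|A|^2 n^{1/k-1})$ for a concrete dense set. The sketch below uses a uniformly random $A$, which is only the easiest case (the theorem concerns arbitrary dense sets); the function names and the default seed are hypothetical.

```python
import random


def edges_inside(A: set[int], k: int) -> int:
    """e(A, A): pairs (u, v) in A x A with v - u = d**k for some d >= 1."""
    if not A:
        return 0
    top = max(A)
    count = 0
    for u in A:
        d = 1
        while u + d ** k <= top:
            if u + d ** k in A:
                count += 1
            d += 1
    return count


def normalised_count(n: int, k: int, eps: float, seed: int = 0) -> float:
    """e(A, A) / (|A|^2 n^(1/k - 1)) for a random A of size int(eps * n)."""
    rng = random.Random(seed)
    A = set(rng.sample(range(1, n + 1), int(eps * n)))
    return edges_inside(A, k) / (len(A) ** 2 * n ** (1 / k - 1))
```

For a random set the ratio stays bounded away from zero as $n$ grows, which is the behaviour Theorem 3.1 guarantees for every set of density at least $\varepsilon$.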

Now let $S$ be a subset of $[n]$ of size $s$. We call $S$ bad if it does not contain any nontrivial
instance of $(a, a + d^k)$. In other words, $S$ is bad if $G(S, S)$ contains no edges. By Lemma
2.1 and Theorem 3.1, the number of bad subsets of $[n]$ of size $s$ is at most $\alpha^s \binom{n}{s}$,
provided that $s \ge C(\alpha)n/d(G)$. This condition is satisfied if we assume that

$$s \ge 2C(\alpha)C^{-1} n^{1-1/k}.$$
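The notion of a bad set is easy to test by brute force. The following sketch (hypothetical names, a Monte Carlo estimate rather than anything used in the proof) checks whether a set is bad and estimates the fraction of bad $s$-subsets of $[n]$ by sampling.

```python
import random


def is_bad(S: set[int], k: int) -> bool:
    """True if S contains no pair (a, a + d**k) with d >= 1."""
    top = max(S, default=0)
    for a in S:
        d = 1
        while a + d ** k <= top:
            if a + d ** k in S:
                return False
            d += 1
    return True


def bad_fraction(n: int, s: int, k: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the fraction of bad s-subsets of {1..n}."""
    rng = random.Random(seed)
    population = range(1, n + 1)
    hits = sum(is_bad(set(rng.sample(population, s)), k) for _ in range(trials))
    return hits / trials
```

On small inputs such as $n = 1000$ and $k = 2$, the estimated fraction drops quickly once $s$ passes $n^{1-1/k} \approx 32$, consistent with the exponential bound $\alpha^s$.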

Next, let $r = s/\delta$ and consider a random subset $R$ of $[n]$ of size $r$. The probability that
$R$ contains a bad subset of size $s$ is at most

$$\alpha^s \binom{n}{s} \frac{\binom{n-s}{r-s}}{\binom{n}{r}} = \alpha^s \binom{r}{s} \le \left(\frac{e\alpha}{\delta}\right)^s = o(1),$$




provided that $\alpha = \alpha(\delta)$ is small enough.
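The union bound in this step can be checked numerically. The sketch below assumes, as in the preceding paragraph, that at most $\alpha^s \binom{n}{s}$ of the $s$-subsets are bad; the function names are hypothetical. It uses the fact that a fixed $s$-set lies in a uniform random $r$-subset with probability $\binom{n-s}{r-s}/\binom{n}{r}$.

```python
from math import comb, e


def expected_s_subsets(n: int, r: int, s: int) -> float:
    """C(n, s) * P(a fixed s-set lies in a uniform random r-subset of {1..n})."""
    # Integer arithmetic first, so that only the final quotient is a float.
    return comb(n, s) * comb(n - s, r - s) / comb(n, r)


def union_bound(n: int, r: int, s: int, alpha: float) -> float:
    """Upper bound on P(R contains a bad s-subset), assuming at most
    alpha**s * C(n, s) of the s-subsets of {1..n} are bad."""
    return alpha ** s * expected_s_subsets(n, r, s)
```

The quantity returned by `expected_s_subsets` equals $\binom{r}{s}$, which is at most $(er/s)^s = (e/\delta)^s$ when $s = \delta r$, so the bound decays exponentially in $s$ once $\alpha < \delta/e$.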

To finish the proof, we note that if $R$ does not contain any bad subset of size $\delta r$, then $R$
is $(\delta, (a, a + d^k))$-dense.